feat(func/greatest-least): new config param ignore null for func greatest/least#35239
feat(func/greatest-least): new config param ignore null for func greatest/least#35239stephenkgu wants to merge 12 commits into3.0from
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the ignoreNullInGreatest configuration parameter, allowing the GREATEST and LEAST functions to skip NULL arguments. The changes include documentation updates in both English and Chinese, the addition of the configuration flag to the client and server components, and the implementation of the logic within the scalar function execution engine. Two significant issues were identified in the implementation: a potential stack buffer overflow in greatestLeastImpl due to a type mismatch when reading the ignoreNull flag, and a risk of accessing uninitialized memory in freeSCovertScarlarParams if an error occurs during parameter conversion. New test cases have been added to verify the feature and its interaction with existing configurations.
There was a problem hiding this comment.
Pull request overview
Adds a new client-scope configuration ignoreNullInGreatest to control whether GREATEST/LEAST skip NULL arguments, along with documentation updates and CI test coverage.
Changes:
- Introduces
ignoreNullInGreatestas a dynamic client config and wires it into function translation/execution by appending a runtime flag parameter. - Updates scalar execution for
GREATEST/LEASTto optionally skip row-level NULL values and NULL literals when configured. - Adds comprehensive scalar-function tests (base coverage + ignore-null coverage) and registers them in CI; updates EN/ZH docs and config-scope references.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test/ci/cases.task | Adds the new scalar test cases to CI execution list. |
| test/cases/11-Functions/01-Scalar/test_fun_sca_greatest_least.py | New base behavior test coverage for GREATEST/LEAST. |
| test/cases/11-Functions/01-Scalar/test_fun_sca_greatest_least_ignorenull.py | New coverage for ignoreNullInGreatest behavior and interaction with compareAsStrInGreatest. |
| source/libs/scalar/src/sclfunc.c | Implements runtime NULL-skipping behavior in scalar execution for greatest/least. |
| source/libs/function/src/builtins.c | Appends the ignore-null flag as a trailing param so per-vnode runtime sees the client-scope setting. |
| source/common/src/tglobal.c | Adds the new client configuration option and dynamic-update wiring. |
| include/common/tglobal.h | Exposes tsIgnoreNullInGreatest global. |
| docs/zh/14-reference/03-taos-sql/22-function.md | Documents NULL behavior and new config option for GREATEST. |
| docs/zh/14-reference/01-components/08-config-scope.md | Lists ignoreNullInGreatest in config scope table. |
| docs/zh/14-reference/01-components/02-taosc.md | Documents ignoreNullInGreatest under taosc parameters. |
| docs/zh/14-reference/01-components/01-taosd.md | Adds ignoreNullInGreatest entry to the taosd parameter reference. |
| docs/en/14-reference/03-taos-sql/22-function.md | Documents NULL behavior and new config option for GREATEST. |
| docs/en/14-reference/01-components/08-config-scope.md | Lists ignoreNullInGreatest in config scope table. |
| docs/en/14-reference/01-components/02-taosc.md | Documents ignoreNullInGreatest under taosc parameters table. |
| docs/en/14-reference/01-components/01-taosd.md | Adds ignoreNullInGreatest entry to the taosd parameter reference table. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
source/libs/scalar/src/sclfunc.c:6206
- In the IsNullType early-return path, numOfRows can remain 0 when outputType is TSDB_DATA_TYPE_NULL because the loop breaks immediately ("if (IsNullType) break;"). This will return an output column with 0 rows even though the input params may have rows (e.g., scalar constants: 1 row; table scan: N rows). Compute numOfRows from the data inputs regardless of outputType, then fill that many NULLs.
bool IsNullType = outputType == TSDB_DATA_TYPE_NULL ? true : false;
// Compute row count and detect all-NULL-typed input case.
for (int32_t i = 0; i < dataInputNum; i++) {
if (IsNullType) {
break;
}
if (numOfRows != 0 && numOfRows != pInput[i].numOfRows && pInput[i].numOfRows != 1 && numOfRows != 1) {
qError("input rows not match, func:%s, rows:%d, %d", __FUNCTION__, numOfRows, pInput[i].numOfRows);
code = TSDB_CODE_TSC_INTERNAL_ERROR;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the ignoreNullInGreatest configuration parameter, which allows the GREATEST and LEAST functions to skip NULL arguments instead of returning NULL by default. The changes span documentation updates in both English and Chinese, client-side configuration handling, and core logic modifications in the function translator and scalar execution engine. Additionally, new test suites covering base functionality, NULL-skipping behavior, and performance benchmarks have been integrated into the CI process. Review feedback identified critical missing NULL checks for memory allocation and column data access within the scalar function implementation.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
source/libs/scalar/src/sclfunc.c:6221
- In greatestLeastImpl(), when outputType is TSDB_DATA_TYPE_NULL (which translateGreatestleast can set when ignoreNullInGreatest=0 and a NULL literal is present), IsNullType is true at initialization and the loop breaks immediately, leaving numOfRows==0. This makes the function return 0 rows (and does not mark any output rows NULL) even though callers created the output column with the correct row count (rowNum). Suggestion: always compute numOfRows from the data inputs regardless of IsNullType (e.g., remove the early break or move row-count calculation before the IsNullType short-circuit), then set all rows NULL when IsNullType is true.
bool IsNullType = outputType == TSDB_DATA_TYPE_NULL ? true : false;
// Compute row count and detect all-NULL-typed input case.
for (int32_t i = 0; i < dataInputNum; i++) {
if (IsNullType) {
break;
}
if (numOfRows != 0 && numOfRows != pInput[i].numOfRows && pInput[i].numOfRows != 1 && numOfRows != 1) {
qError("input rows not match, func:%s, rows:%d, %d", __FUNCTION__, numOfRows, pInput[i].numOfRows);
code = TSDB_CODE_TSC_INTERNAL_ERROR;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the ignoreNullInGreatest configuration parameter, allowing the GREATEST and LEAST functions to skip NULL arguments instead of returning NULL when any argument is NULL. The changes include documentation updates, configuration system integration, runtime logic implementation in sclfunc.c, and comprehensive test coverage. Feedback includes a potential null pointer dereference in greatestLeastImpl during memory allocation failure and a suggestion to unify error handling using the _return label.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the ignoreNullInGreatest configuration parameter for the GREATEST and LEAST functions, allowing users to toggle between MySQL-compatible NULL propagation and a mode that skips NULL arguments. The implementation includes updates to the documentation, configuration handling in both client and server, and the core scalar function logic. Feedback was provided regarding potential memory leaks in the greatestLeastImpl function, where inconsistent error handling using SCL_ERR_JRET might bypass the necessary resource cleanup at the _return label.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new configuration parameter ignoreNullInGreatest for the GREATEST and LEAST functions, allowing users to skip NULL arguments instead of returning NULL. The changes include updates to documentation, global configuration handling, the core implementation in greatestLeastImpl and vectorCompareAndSelect, and comprehensive test coverage. I have identified two areas for improvement: the numOfCols parameter type in vectorCompareAndSelect should be updated to int32_t for consistency, and an early return should be added to greatestLeastImpl when numOfRows is 0 to avoid unnecessary memory allocation and potential errors.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the ignoreNullInGreatest configuration parameter, which allows the GREATEST and LEAST functions to optionally skip NULL arguments. By default, the functions maintain MySQL compatibility by returning NULL if any argument is NULL; however, when this new flag is enabled, they compare only non-NULL values. The implementation includes updates to the configuration system, function translation logic to pass the flag to the runtime, and the core scalar execution engine to handle row-level NULL skipping. Additionally, the PR provides extensive documentation updates in both English and Chinese, along with new test suites for functional verification and performance regression testing. I have no feedback to provide.
JinqingKuang
left a comment
There was a problem hiding this comment.
Code Review 总结
审查发现 1 个 Critical 和 2 个 High 级别问题,建议修复后合并。
统计
- Critical: 1 个(内存安全)
- High: 2 个(内存泄漏 + 可维护性)
- Medium: 2 个(文档 + 测试覆盖)
详细问题见下方评论。
Code Review 总结审查发现 1 个 Critical 和 2 个 High 级别问题,建议修复后合并。 统计
问题 1: 🔴 Critical - 空指针解引用风险文件:
修复建议: if (pInput[inputNum - 1].columnData == NULL ||
pInput[inputNum - 1].columnData->pData == NULL ||
pInput[inputNum - 1].columnData->info.type == 0) {
qError("greatestLeast: invalid flag column, func:%s", __FUNCTION__);
code = TSDB_CODE_TSC_INTERNAL_ERROR;
goto _return;
}问题 2: 🟠 High - 内存泄漏风险文件: 如果 修复建议: _return:
freeSCovertScarlarParams(pCovertParams, outIdx); // 使用实际填充的数量
taosMemoryFree(resultColIndex);
return code;需要在函数开头将 问题 3: 🟠 High - 复杂嵌套逻辑文件:
建议重构: static int32_t vectorCompareAndSelect(SCovertScarlarParam *pParams, int32_t numOfRows,
int32_t numOfCols, int32_t *resultColIndex,
EOperatorType optr, bool ignoreNull) {
if (ignoreNull) {
return vectorCompareAndSelectIgnoreNull(pParams, numOfRows, numOfCols, resultColIndex, optr);
} else {
return vectorCompareAndSelectDefault(pParams, numOfRows, numOfCols, resultColIndex, optr);
}
}将两种模式拆分为独立函数,降低单个函数的复杂度。 问题 4: 🟡 Medium - 文档不一致文件: 英文文档只有一行简短描述,而中文文档( 建议:英文文档补充完整的参数说明格式,与中文文档保持一致的结构。 问题 5: 🟡 Medium - 测试覆盖不足文件: 当前测试缺少以下边界场景:
建议添加:
这些场景可以在后续 PR 中补充。 |
|
1, info.type 在 translate 的时候赋值,这里不是指针,即便数据被踩乱了也不会 segfault |
Description
Issue(s)
https://project.feishu.cn/taosdata_td/feature/detail/6511294180
Checklist
Please check the items in the checklist if applicable.